c-lasso - a Python package for constrained sparse and robust regression and classification

Abstract

We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. The underlying statistical forward model is assumed to be of the following form: \[ y = X \beta + \sigma \epsilon \qquad \textrm{subject to} \qquad C\beta=0 \] Here, $X \in \mathbb{R}^{n\times d}$ is a given design matrix and the vector $y \in \mathbb{R}^{n}$ is a continuous or binary response vector. The matrix $C$ is a general constraint matrix. The vector $\beta \in \mathbb{R}^{d}$ contains the unknown coefficients and $\sigma$ an unknown scale. Prominent use cases are (sparse) log-contrast regression with compositional data $X$, requiring the constraint $1_d^T \beta = 0$ (Aitchison and Bacon-Shone 1984), and the Generalized Lasso, which is a special case of the described problem (see, e.g., James, Paulson, and Rusmevichientong 2020, Example 3). c-lasso provides estimators for inferring the unknown coefficients and scale (i.e., perspective M-estimators (Combettes and Müller 2020a)) of the form \[ \min_{\beta \in \mathbb{R}^d, \sigma \in \mathbb{R}_{0}} f\left(X\beta - y,{\sigma} \right) + \lambda \left\lVert \beta\right\rVert_1 \qquad \textrm{subject to} \qquad C\beta = 0 \] for several convex loss functions $f(\cdot,\cdot)$. This includes the constrained Lasso, scaled Huber ...
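To make the optimization problem in the abstract concrete, here is a minimal NumPy sketch of the simplest case: the squared-error loss $f(r, \sigma) = \lVert r \rVert_2^2$ with the scale held fixed, so that the problem reduces to a Lasso with the constraint $C\beta = 0$. It uses a textbook ADMM splitting, which is an assumption chosen for brevity and not necessarily the solver c-lasso implements. The function names `soft_threshold` and `constrained_lasso_admm`, the parameters `rho` and `n_iter`, and the synthetic data are all hypothetical and do not come from the package.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def constrained_lasso_admm(X, y, C, lam, rho=1.0, n_iter=500):
    """Sketch: min ||X b - y||^2 + lam * ||b||_1  subject to  C b = 0.

    ADMM with the splitting b = z. The b-update is an equality-constrained
    least-squares problem solved through its KKT system; the z-update is
    soft-thresholding. Assumes C has full row rank.
    """
    n, d = X.shape
    m = C.shape[0]
    # KKT matrix of the b-update; it does not change across iterations.
    K = np.block([
        [2.0 * X.T @ X + rho * np.eye(d), C.T],
        [C, np.zeros((m, m))],
    ])
    Xty2 = 2.0 * X.T @ y
    z = np.zeros(d)
    u = np.zeros(d)  # scaled dual variable for the constraint b = z
    for _ in range(n_iter):
        rhs = np.concatenate([Xty2 + rho * (z - u), np.zeros(m)])
        beta = np.linalg.solve(K, rhs)[:d]   # satisfies C beta = 0
        z = soft_threshold(beta + u, lam / rho)
        u = u + beta - z
    return beta, z


# Synthetic log-contrast style example: coefficients must sum to zero.
rng = np.random.default_rng(0)
n, d = 40, 6
X = rng.standard_normal((n, d))
beta_true = np.array([1.0, -1.0, 0.0, 0.0, 0.0, 0.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)
C = np.ones((1, d))  # encodes the constraint 1_d^T beta = 0

beta_hat, z_hat = constrained_lasso_admm(X, y, C, lam=1.0)
print("beta_hat:", np.round(beta_hat, 3))
print("constraint residual:", np.abs(C @ beta_hat).max())
```

In this sketch `beta_hat` satisfies the constraint up to the accuracy of the linear solve, while `z_hat` carries the exact zeros produced by soft-thresholding; the two coincide once ADMM has converged. The robust (Huber) and scale-estimating variants mentioned in the abstract would need a different loss term and are not covered here.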

Similar resources

Robust Estimation in Linear Regression with Multicollinearity and Sparse Models

One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...

The innovation of a statistical model to estimate dependable rainfall (DR) and develop it for determination and classification of drought and wet years of Iran

Water from precipitation is the source that meets the countless needs of living beings, especially humans, and any reduction in its quantity or quality directly and negatively affects the life of living organisms. Year-to-year fluctuation is one of the fundamental and most important characteristics of Iran's annual precipitation, and its harmful effects are reflected in one way or another in all economic, social, and even political and security spheres. Because the amount of water from precipitation is one of the main components of planning ...

pyGPs: a Python library for Gaussian process regression and classification

We introduce pyGPs, an object-oriented implementation of Gaussian processes (GPs) for machine learning. The library provides a wide range of functionalities, ranging from simple GP specification via mean and covariance and GP inference to more complex implementations of hyperparameter optimization, sparse approximations, and graph-based learning. Using Python we focus on usability for both “use...

Quantile regression with group lasso for classification

Applications of regression models for binary response are very common and models specific to these problems are widely used. Quantile regression for binary response data has recently attracted attention and regularized quantile regression methods have been proposed for high dimensional problems. When the predictors have a natural group structure, such as in the case of categorical predictors co...

Robust and sparse bridge regression

It is known that when there are heavy-tailed errors or outliers in the response, the least squares methods may fail to produce a reliable estimator. In this paper, we proposed a generalized Huber criterion which is highly flexible and robust for large errors. We applied the new criterion to the bridge regression family, called robust and sparse bridge regression (RSBR). However, to get the RSBR...

Journal

Journal title: Journal of Open Source Software

Year: 2021

ISSN: 2475-9066

DOI: https://doi.org/10.21105/joss.02844